Abstract

The transition between a regime in which thermodynamic relations apply only to ensembles 
of small systems coupled to a large environment and a regime in which they can be used to 
characterize individual macroscopic systems is analyzed in terms of the change in behavior of the 
Jarzynski estimator of equilibrium free energy differences from nonequilibrium work measurements. 
Given a fixed number of measurements, the Jarzynski estimator is unbiased for sufficiently small 
systems. In these systems the directionality of time is poorly defined and the configurations that 
dominate the empirical average, but which are in fact typical of the reverse process, are sufficiently 
well sampled. As the system size increases the arrow of time becomes better defined. The dominant 
atypical fluctuations become rare and eventually cannot be sampled with the limited resources that 
are available. Asymptotically, only typical work values are measured. The Jarzynski estimator 
becomes maximally biased and approaches the exponential of minus the average work, which is the 
result that is expected from standard macroscopic thermodynamics. In the proper scaling limit, 
this regime change has been recently described in terms of a phase transition in variants of the 
random energy model (REM). In this paper, this correspondence is further demonstrated in two 
examples of physical interest: the sudden compression of an ideal gas and adiabatic quasi-static 
volume changes in a dilute real gas. 
I. INTRODUCTION



The laws of thermodynamics summarize empirical observations about the approximate 
and most probable behavior of macroscopic systems [1-4]. They are formulated under the 
assumption of limited resources in the measurements of the quantities involved. In the 
words of Gibbs, thermodynamic laws "express the laws of mechanics for [systems of a great
number of particles] as they appear to beings who have not the fineness of perception to 
enable them to appreciate quantities of the order of magnitude of those which relate to 
single particles, and who cannot repeat their experiments often enough to obtain any but 
the most probable results" [1]. This statement summarizes the conditions under which a 
proper thermodynamic description for a single system can be made: (i) the system has a 
large number of degrees of freedom, (ii) the quantities of interest involve averages over space 
and time on scales that are large compared to the corresponding molecular scales, (iii) there 
are limitations in the time span of the measurements, and (iv) there are limitations in the 
number of measurements made. For instance, the elimination of condition (iii) gives rise to 
the objection formulated by Zermelo, based on the fact that any isolated mechanical system 
will come arbitrarily close to its initial state provided that a sufficiently long period of time 
elapses (Poincaré recurrences) [3-5]. If the system is ergodic, averages over phase-space can
be replaced by time averages and conditions (iii) and (iv) are equivalent. 

Improvements in measurement devices and techniques, an increasing interest in smaller 
scale systems in biology, physics and chemistry [6-11] and developments in dynamical sys- 
tems and chaos theory [12] have prompted numerous researchers to develop interpretations 
and extensions of thermodynamics applicable to systems with small numbers of degrees of 
freedom [8, 13-17]. In such systems it is not possible to uphold the standard interpretation 
of the thermodynamic quantities and of their relations. In particular, the measurements are 
made on molecular scales and produce values that are dominated by fluctuations. Nonethe- 
less, reproducible results are obtained by performing averages over many independent mea- 
surements of small systems at equilibrium (e.g., in contact with a heat bath). Therefore, a 
proper thermodynamic description is recovered if the ensemble method, which was originally 
devised as an operational construct to obtain results for single macroscopic systems, is given 
a literal interpretation: Thermodynamic quantities are identified with ensemble averages, 
which are experimentally realized as averages over independent measurements under equilibrium
conditions. Note that the condition of equilibrium requires that the small system
be in contact with some other system with a large number of degrees of freedom (e.g. a 
heat reservoir for the canonical ensemble), so that correct results are obtained from the as- 
sumption that the initial state in a particular realization of the process can be treated as an 
independent sample from the appropriate ensemble. Thus, conditions (i)-(iv) are required 
for the whole and are essential for a proper thermodynamic description of the small system 
as well. 

In this work we investigate the transition between the regime in which the thermodynamic 
description is valid for a single system and the regime in which thermodynamic quantities 
need to be understood as averages over realizations of a given process. To this end we 
analyze the behavior of the empirical estimates of equilibrium free energy differences from 
nonequilibrium work measurements by means of the Jarzynski equality. In contrast to 
standard thermodynamic relations, the ensemble average that appears in the Jarzynski 
equality is dominated by rare extreme fluctuations [18]. Consequently, to obtain accurate 
estimations of the free energy change, one needs to perform repeated work measurements 
in the nonequilibrium process. The number of measurements needed to obtain meaningful 
estimates increases exponentially with the size of the system [18, 19]. Therefore, the average 
that appears in the Jarzynski equality cannot be realized in the macroscopic regime, in 
which definite values of thermodynamic quantities can be ascribed to single systems rather 
than to collections of systems. 

For limited resources, when the number of measurements is fixed, the Jarzynski estimator 
of free energy differences becomes biased as the size of the system increases. In the appropri- 
ate scaling limit, the appearance of the bias is abrupt and corresponds to a phase transition 
in variants of the random energy model (REM) [20-24]. The connection between the random 
energy model and the Jarzynski estimator of free energy differences was made in [25-27]. In
those articles, expressions for the free energy in the low-temperature phase (small-M limit, where
M is the number of nonequilibrium work measurements) and in the high-temperature phase
(large-M limit) of the random energy model, including finite-M corrections, were derived
for a parametric family of work distributions. In another recent work, the convergence of
Monte Carlo estimates has been analyzed in terms of the random energy model in connection
with the 'sign problem' [28].

In the current paper we further establish the correspondence between the Jarzynski estimator
of free energy differences and variants of the random energy model for the sudden
compression of an ideal gas and for adiabatic quasi-static volume changes in a dilute real gas.
Even though this correspondence is explicitly established for only a few particular physical
systems, it is expected to hold in more general situations. The origin of the phase transitions
in sums of random exponentials is the interplay between classic limit theorems for
sums (e.g., the central limit theorem, the law of large numbers) and extreme value statistics
[29, 30]. A contribution of this paper is to highlight the importance of the phase transition
in the Jarzynski estimator of free energy differences as a signal of the change between two
distinct regimes in which the ensemble and the single-system interpretations of thermodynamic
quantities are applicable, respectively.

The article is organized as follows: Section II introduces the Jarzynski equality and
related concepts that are necessary for the subsequent study. The use of this equality 
in the estimation of equilibrium free energy differences from nonequilibrium work values 
is described in Section III. Section IV analyzes the change in behavior of the Jarzynski 
estimator as a function of system size and of the number of measurements performed in three 
illustrative examples. Finally, Section V discusses how this change in behavior corresponds 
to the emergence of classical macroscopic thermodynamics and a well-defined arrow of time 
as the size of the system is increased, when the number of measurements performed is fixed. 
The technical details of the analysis of the Jarzynski estimator in terms of variants of the 
random energy model are deferred to the appendices. 

II. THE JARZYNSKI EQUALITY 

Consider a system characterized by a Hamiltonian $H(\Gamma; \lambda)$, where $\Gamma$ denotes a point in
phase space. The parameter $\lambda$ can be modified by external manipulation to perform work on or
extract work from the system. If the system is coupled to other degrees of freedom, the
changes in $\lambda$ are assumed to affect only the Hamiltonian of the system of interest and not
the terms that describe its interaction with the environment or the environment itself.

Assume that the parameter $\lambda$ is modified in the interval $[0, \tau]$ according to a specified
protocol $\{\lambda(t);\ 0 \le t \le \tau\}$ from $\lambda(0) = \lambda_0$ to $\lambda(\tau) = \lambda_1$. It is possible to show that the work
associated with the transition between an initial configuration of the system sampled from
an equilibrium distribution with Hamiltonian $H_0(\Gamma) = H(\Gamma; \lambda_0)$ at temperature $\beta^{-1}$ and a
final configuration corresponding to the Hamiltonian $H_1(\Gamma) = H(\Gamma; \lambda_1)$ is a random variable
$W$ that satisfies the equality [31, 32]

$$\langle e^{-\beta W} \rangle = e^{-\beta \Delta F}, \qquad \Delta F = F_1 - F_0. \tag{1}$$

The average is performed over trajectories whose initial point in phase space is sampled from
a canonical distribution

$$p_0(\Gamma) = \frac{e^{-\beta H_0(\Gamma)}}{\int_\Omega d\Gamma'\, e^{-\beta H_0(\Gamma')}}, \tag{2}$$

where the integral is over $\Omega$, the region of phase space accessible to the system. This region
is assumed to remain unchanged during the process. Note that the state in which the system
finds itself after the manipulation has been completed need not be an equilibrium one. In
general, $F_1$ is not the free energy of such a state. Therefore, the free energy difference that
appears on the right-hand side of (1), with

$$F_i = -\beta^{-1} \log \int_\Omega d\Gamma\, e^{-\beta H_i(\Gamma)}, \qquad i = 0, 1, \tag{3}$$

need not coincide with the actual free energy change in the process described [32, 33]. It is
the difference between the free energies of the system in equilibrium at temperature $\beta^{-1}$ in
two states characterized by Hamiltonians $H_0$ and $H_1$, respectively. If the system is strongly
coupled with an environment that acts as a heat reservoir, $H(\Gamma; \lambda)$ is the potential of mean
force associated with the variables of the system of interest [32, 34]. If the system is isolated
or weakly coupled with its environment during the external manipulation, $H(\Gamma; \lambda)$ can be
identified with the Hamiltonian of the isolated system [31].



III. ESTIMATION OF FREE ENERGY DIFFERENCES FROM NONEQUILIB- 
RIUM WORK MEASUREMENTS 

Since it was derived, the Jarzynski equality has attracted a fair amount of interest because
it allows the computation of equilibrium free energy differences from measurements of work
in general nonequilibrium processes

$$\Delta F = -\beta^{-1} \log \langle e^{-\beta W} \rangle, \tag{4}$$

where the angular brackets denote an average over the canonical ensemble at temperature
$\beta^{-1}$ with respect to the initial Hamiltonian $H_0$. However, this equality differs in crucial
aspects from standard thermodynamic relations. Because of the molecular nature of the
systems analyzed, thermodynamic quantities (energy density, number density, temperature, 
pressure, work, etc.) are fluctuating variables. In macroscopic systems, the fluctuations are 
small. This allows us to identify these random thermodynamic quantities with their typical 
values, which are deterministic and can be measured in single experiments. Repetitions of 
these experiments yield values that are indistinguishable, within the error of the macroscopic 
measurement. Therefore, for standard thermodynamic quantities, typical and average values 
are close to each other and can be used interchangeably to characterize the macroscopic state 
of the system under study. 

In contrast, the average that appears in the Jarzynski equality needs to be interpreted as 
a true ensemble average [31]. The work measured in a particular realization of a nonequilib- 
rium process depends on the initial microscopic configuration of the system and is therefore 
a fluctuating quantity. In each of these measurements, the system is prepared in an initial 
state sampled from the equilibrium distribution. One then carries out the intervention that 
gives rise to the nonequilibrium process. The work involved in this process is then recorded. 
Finally, to extract equilibrium free energy differences from these measurements, the average 
of the exponential of minus these work values is computed. Because of the exponential 
form of the summed quantities, typical and average values are, in the general case, very 
different. Unlike for standard thermodynamic relations, plugging in the typical value of the 
work on the left-hand side of (1) does not fulfill the equality. The reason is that the average 
is dominated by extreme events whose probability of occurrence is very low. Therefore, a 
sufficiently large number of measurements is needed so that these rare events are well represented
and the equality can be empirically realized. The number of measurements required
to provide a meaningful estimate of the average increases exponentially with the size of the 
system [18, 19, 35]. Therefore, it is not practicable in macroscopic systems. Nonetheless, 
the regime in which the equality can be experimentally verified is accessible for sufficiently 
small systems. 

To realize the average, repeated experiments under the same conditions are carried out. 
The values of work obtained in each of these experiments are recorded and used to estimate 
the average 

$$\langle e^{-\beta W} \rangle_M = \frac{1}{M} \sum_{m=1}^{M} e^{-\beta W_m}. \tag{5}$$

By means of the Jarzynski equality, this Monte Carlo average can be used to estimate the
change in free energy

$$\Delta F_M = -\beta^{-1} \log \langle e^{-\beta W} \rangle_M. \tag{6}$$

Since the particular realization of work values $\{W_m\}_{m=1}^{M}$ is random, the estimate $\Delta F_M$ is
also a random variable. Our goal is to understand the properties of this random variable as
a function of $M$, the number of nonequilibrium work measurements, and $N$, the size of the
system.
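As an illustration of the estimator defined above, the following minimal Python sketch evaluates the finite-sample Jarzynski estimate for a list of work values. The function name `jarzynski_estimate` and the sample values are hypothetical and chosen only for illustration; the largest exponent is factored out before exponentiating, which is a standard numerical precaution and not part of the original analysis.

```python
import numpy as np

def jarzynski_estimate(work, beta=1.0):
    """Finite-sample Jarzynski estimate: -(1/beta) * log(mean(exp(-beta * W_m)))."""
    work = np.asarray(work, dtype=float)
    a = -beta * work
    a_max = a.max()
    # Factor out the largest exponent so that exp() neither overflows nor underflows.
    log_mean_exp = a_max + np.log(np.mean(np.exp(a - a_max)))
    return -log_mean_exp / beta

# Hypothetical work values, in units of 1/beta.
work_samples = [2.0, 3.5, 1.2, 4.1, 2.8]
delta_f_hat = jarzynski_estimate(work_samples)
print(delta_f_hat, np.mean(work_samples))
```

By Jensen's inequality the estimate never exceeds the sample mean of the work, and it never falls below the smallest work value in the sample.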

Because of the exponential form of the quantity averaged, the role of extreme fluctuations 
is very important. In contrast to usual thermodynamic averages, such as the average
work $\langle W \rangle$, which are dominated by configurations that are typical of the initial (equilibrium)
state of the system, the average $\langle e^{-\beta W} \rangle$ is actually dominated by rare configurations that are
typical of the system at equilibrium at temperature $\beta^{-1}$ with respect to the final Hamiltonian
[18].

Of particular interest is the block [36, 37] or quenched [38] average 

$$E[\Delta F_M] = -\beta^{-1} E\left[\log \langle e^{-\beta W} \rangle_M\right], \tag{7}$$

where the expectation $E[\cdot]$ is with respect to independent realizations of $M$ measurements,
each of which corresponds to an independent sample from an initial equilibrium canonical
distribution at temperature $\beta^{-1}$. In [36, 37] $E[\Delta F_M]$ is referred to as the finite-data average
free energy. This quantity is an estimator of $\Delta F$, albeit a biased one, in general. The bias
is the difference between the expected value of this estimator and the actual value of the
free energy change

$$B_M = E[\Delta F_M] - \Delta F. \tag{8}$$

Using the law of large numbers it is possible to show that in the limit $M \to \infty$ the block
average converges to the free energy change

$$\lim_{M \to \infty} E[\Delta F_M] = \Delta F. \tag{9}$$

Therefore, the estimator $E[\Delta F_M]$ is asymptotically unbiased

$$\lim_{M \to \infty} B_M = 0. \tag{10}$$

For a single experiment, $M = 1$, the estimator equals the average work performed on the
system

$$E[\Delta F_1] = \langle W \rangle. \tag{11}$$



Following [19], we define the 'dissipated work' in a given realization of the experiment as

$$W_{dis} = W - \Delta F. \tag{12}$$

Using Jensen's inequality it is possible to show that $E[\Delta F_1]$ is a positively biased estimator
of $\Delta F$

$$\langle W \rangle = E[\Delta F_1] \ge \Delta F. \tag{13}$$

In fact, the convergence of $E[\Delta F_M]$ to the asymptotic limit $\Delta F$ is monotonic [37]

$$\langle W \rangle = E[\Delta F_1] \ge E[\Delta F_M] \ge E[\Delta F_{M+1}] \ge E[\Delta F_\infty] = \Delta F, \qquad 1 < M < \infty. \tag{14}$$

In terms of the bias

$$\langle W_{dis} \rangle = \langle W \rangle - \Delta F = B_1 \ge B_M \ge B_{M+1} \ge B_\infty = 0, \qquad 1 < M < \infty. \tag{15}$$

Therefore, the maximum bias corresponds to $M = 1$ and coincides with the average dissipated
work

$$B_{max} = B_1 = \langle W_{dis} \rangle. \tag{16}$$

The main contribution of this research is to explicitly show in several paradigmatic cases 
that the finite sample estimate of free energy differences from the Jarzynski equality for 
a particular system exhibits a change of behavior as M, the number of repetitions of the 
experiment, increases. Alternatively, for a fixed number of measurements, the regime change 
occurs as a function of $N$, the system size. For small systems the Jarzynski estimator is
unbiased. In these systems the nonequilibrium work measurements are dominated by fluctuations.
Configurations that are typical of the reversed process are well sampled, which
means that the arrow of time is poorly defined. As the system size increases, the probability
of sampling these configurations becomes exponentially small, so that they are not observed
in practice. The Jarzynski estimator of the free energy change becomes biased and asymptotically
approaches the value of the average work. The suppression of these fluctuations
also leads to the emergence of a well-defined arrow of time in the system. In the limit
$M \to \infty$, $N \to \infty$ with $\log M / N \to$ constant, the regime change is akin to a phase transition
that arises in simplified models of spin glasses, such as the random energy model in both its
continuous [20, 21] and discrete [22, 24, 39] versions.
IV. PHASE TRANSITION IN THE JARZYNSKI ESTIMATOR OF FREE ENERGY DIFFERENCES



We now proceed to analyze the behavior of the Jarzynski estimator in three important
cases. The first one corresponds to processes in which the nonequilibrium work distribution
is Gaussian [19]. This is a particular case of the class of work distributions analyzed in
[27] with $\delta = 2$. The second case is a compression experiment for an ideal gas [33, 35, 40].
Finally, we consider adiabatic and quasi-static volume changes for a dilute classical gas of
interacting particles [41].

The change of regime is best analyzed in terms of the normalized bias

$$\tilde{B}_M = \frac{B_M}{B_{max}} = \frac{E[\Delta F_M] - \Delta F}{\langle W \rangle - \Delta F}, \tag{17}$$

where the maximum bias is the difference between the average work in the actual nonequilibrium
process (which is not necessarily isothermal) and the free energy difference in the
corresponding isothermal process

$$B_{max} = \langle W \rangle - \Delta F. \tag{18}$$

The normalized bias is a monotonically decreasing function of $M$, which
is bounded between 0 and 1

$$1 = \tilde{B}_1 \ge \tilde{B}_M \ge \tilde{B}_{M+1} \ge \tilde{B}_\infty = 0, \qquad 1 < M < \infty. \tag{19}$$

To simplify the derivations we assume $\beta = 1$. It is straightforward to reintroduce this
parameter in the final expressions by noting that setting $\beta = 1$ is equivalent to measuring
energies in units of $\beta^{-1} = k_B T$, where $k_B$ is the Boltzmann constant and $T$ is the initial
equilibrium temperature.



A. Gaussian work distribution 



In this section, we illustrate the connection between the Jarzynski free energy difference 
estimator and the random energy model for a Gaussian work distribution. The results pre- 
sented in this section were first derived in [27]. That reference gives explicit expressions 
for the bias of the Jarzynski estimator in different regimes for a general class of work dis- 
tributions, which includes the Gaussian as a particular case. It also considers finite-size 
corrections, which are ignored in our analysis. 
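Assume that in the sample estimate

$$\langle e^{-W} \rangle_M = \frac{1}{M} \sum_{m=1}^{M} e^{-W_m} \tag{20}$$

the work values $\{W_m\}_{m=1}^{M}$ follow a normal distribution whose mean is $\langle W \rangle$, and whose
variance is $\sigma^2$ [19]. In this case,

$$\Delta F = -\log \langle e^{-W} \rangle = \langle W \rangle - \frac{1}{2}\sigma^2. \tag{21}$$

Therefore, the maximum bias is

$$B_{max} = \langle W \rangle - \Delta F = \frac{1}{2}\sigma^2. \tag{22}$$

This is an extensive quantity and scales with the size of the system. In the limit $\sigma \to \infty$,
$M \to \infty$ and $\log M / \sigma^2$ finite, the estimate of the free energy has an abrupt change of
behavior (see appendix A)

$$E[\Delta F_M] \to \begin{cases} \langle W \rangle - \frac{1}{2}\sigma^2, & M \ge \exp\{\frac{1}{2}\sigma^2\} \\ \langle W \rangle - \sqrt{2\sigma^2 \log M} + \log M, & M < \exp\{\frac{1}{2}\sigma^2\}. \end{cases} \tag{23}$$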

These expressions correspond to those derived in [27] (Eq. (4) and the paragraph before
this equation in that reference) for $\delta = 2$, $\Omega^2 = 2\sigma^2$, $N = M$, and D^ — jl.
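For a fixed value of $\sigma$ the normalized bias is

$$\tilde{B}_M = \frac{B_M}{B_{max}} = \begin{cases} 0, & M \ge M_c \\ \left(1 - \sqrt{\log M / \log M_c}\right)^2, & M < M_c \end{cases} \tag{24}$$

with $M_c = \exp\{\frac{1}{2}\sigma^2\}$.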
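The limiting expressions for the Gaussian case can be checked numerically. The sketch below, with hypothetical function names, encodes the two branches of the asymptotic block average and of the normalized bias with $\beta = 1$, and verifies that dividing the bias by $\sigma^2/2$ reproduces the normalized form. It is a direct transcription of the limit formulas and ignores finite-size corrections.

```python
import math

def block_average_rem(mean_w, sigma, M):
    """Asymptotic block average E[dF_M] for Gaussian work (beta = 1)."""
    log_m = math.log(M)
    if log_m >= 0.5 * sigma**2:          # M >= M_c: unbiased branch
        return mean_w - 0.5 * sigma**2
    # M < M_c: dominated by the smallest sampled work value
    return mean_w - sigma * math.sqrt(2.0 * log_m) + log_m

def normalized_bias_rem(sigma, M):
    """Asymptotic normalized bias B_M / B_max with M_c = exp(sigma^2 / 2)."""
    log_mc = 0.5 * sigma**2
    log_m = math.log(M)
    if log_m >= log_mc:
        return 0.0
    return (1.0 - math.sqrt(log_m / log_mc))**2

sigma, mean_w = 4.0, 10.0                # hypothetical values
delta_f = mean_w - 0.5 * sigma**2
for M in (1, 10, 100, 2980, 5000):
    bias = block_average_rem(mean_w, sigma, M) - delta_f
    print(M, bias / (0.5 * sigma**2), normalized_bias_rem(sigma, M))
```

The two printed columns agree for every $M$, and the normalized bias equals 1 at $M = 1$ and vanishes for $M \ge M_c$.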

Figure 1 displays the dependence of the normalized bias (24) on $\sqrt{\log M / \log M_c}$
with $\sigma$ fixed, for different values of $\sigma$. The curve corresponding to the asymptotic limit
$\sigma \to \infty$ is plotted as a dash-dotted line. The remaining curves are averages over Monte
Carlo simulations. For small numbers of measurements ($M < M_c$), the Jarzynski estimator
is biased. In this regime, the free energy exhibits strong (of order 1) non-Gaussian fluctuations
around its average (Theorem 1.6 from [23]). These fluctuations are driven by the
Poisson process of the extremes of the random nonequilibrium work measurements. The
range $M > M_c$ corresponds to a regime in which the estimate of the free energy change
for sufficiently large $\sigma$ is unbiased. The bias persists beyond this limit for small systems:
when $\sigma$ is small, the transition between regimes is more gradual. This is the region in
which one expects to observe convergence to the Jarzynski limit in experiments. There is
a second phase transition in the system: in the range $e^{\sigma^2/2} < M < e^{\sigma^2}$ the fluctuations
can be expressed in terms of the Poisson process of extremes of the nonequilibrium work
measurements. Beyond the threshold $M_c' = e^{\sigma^2}$, the central limit theorem holds and the
fluctuations around the block average are approximately Gaussian (Theorem 1.5 from [23]).
The graph presented corresponds to Fig. 2(a) in [27], with $\sigma^2 = \Omega^2/2$. The main difference
is the square root in the abscissae, which does not modify the point at which the phase
transition occurs in the REM limit or the qualitative picture.

Table I displays the number of measurements needed to obtain an unbiased estimate
of the free energy using the Jarzynski estimator ($M_c$) and to reach a regime in which the
fluctuations around this estimate are Gaussian ($M_c'$) for several values of $B_{max}$ in the
range explored in the experiments described in [27] (from $k_B T$ to $20 k_B T$).
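The entries of Table I can be reproduced with a few lines of Python. The sketch assumes that the two thresholds are $\exp(B_{max})$ and $\exp(2 B_{max})$, with $B_{max} = \sigma^2/2$ expressed in units of $k_B T$, and that the tabulated integers are obtained by rounding up; both the rounding convention and the function name `critical_measurements` are assumptions made for illustration, chosen because they match the tabulated values.

```python
import math

def critical_measurements(b_max):
    """Thresholds (M_c, M_c') for a Gaussian work distribution.

    b_max is the maximum bias sigma^2 / 2 in units of k_B T.
    Assumes M_c = exp(sigma^2 / 2) and M_c' = exp(sigma^2), rounded up.
    """
    sigma2 = 2.0 * b_max
    m_c = math.ceil(math.exp(0.5 * sigma2))
    m_c_prime = math.ceil(math.exp(sigma2))
    return m_c, m_c_prime

for b_max in (1, 2, 5, 10, 20):
    print(b_max, critical_measurements(b_max))
```

The exponential growth of both thresholds with $B_{max}$ is what makes the unbiased regime inaccessible once the dissipated work exceeds a few tens of $k_B T$.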

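The regime change can also be observed when the number of measurements is fixed and
the size of the system, measured in terms of $\sigma^2$, increases. For a fixed $M$, the normalized
bias is

$$\tilde{B}_M = \frac{B_M}{B_{max}} = \begin{cases} 0, & \sigma \le \sigma_c \\ \left(1 - \sigma_c/\sigma\right)^2, & \sigma > \sigma_c \end{cases} \tag{25}$$

where $\sigma_c = \sqrt{2 \log M}$.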

TABLE I. Number of measurements needed to obtain an unbiased estimate of the free energy
($M_c$) and to reach a regime in which the fluctuations around this estimate are Gaussian ($M_c'$) as a
function of $B_{max}$, when the nonequilibrium work distribution is Gaussian.

$B_{max}$    $k_B T$    $2 k_B T$    $5 k_B T$    $10 k_B T$     $20 k_B T$
$M_c$        3          8            149          22,027         485,165,196
$M_c'$       8          55           22,027       485,165,196    $2.35 \cdot 10^{17}$

FIG. 1. Approach of the normalized bias to the curve (24) as $M \to \infty$. The random energy model
corresponds to the dash-dotted line. Horizontal axis: $(\log M / \log M_c)^{1/2}$.

Figure 2 displays the curves that trace the dependence of the normalized bias (25) on
$\log_{10}(\sigma/\sigma_c)$ with $M$ fixed, for different values of $M$. The curve corresponding to
the random energy model ($M \to \infty$) is displayed as a dash-dotted line. In small systems
($\sigma < \sigma_c$) the estimate of the free energy difference is unbiased. For sufficiently small systems
($\sigma \to 0$), the bias scales as $B_M \sim B_{max}/M$. Therefore, the bias is reduced by increasing $M$
linearly. This is the region in which one expects convergence of the sample average to the
Jarzynski equality limit. Beyond that threshold ($\sigma > \sigma_c$), the empirical estimate is biased.
As the system size increases the bias approaches its maximum value, which corresponds to
measuring the typical value in most of the nonequilibrium work measurements, in agreement
with the expected behavior of classical macroscopic systems. In this regime, linear increases
in $M$ do not lead to significant changes in the bias observed.
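The change of regime at fixed $M$ can be explored with a small Monte Carlo experiment. The following sketch, in which the function names, the seed, and the number of blocks are arbitrary illustrative choices, draws zero-mean Gaussian work values with $\beta = 1$ (so that $\Delta F = -\sigma^2/2$ and $B_{max} = \sigma^2/2$), averages the Jarzynski estimate over independent blocks of $M$ measurements, and compares the normalized bias with the limiting curve $(1 - \sigma_c/\sigma)^2$.

```python
import numpy as np

def simulate_normalized_bias(sigma, M, n_blocks=2000, seed=0):
    """Monte Carlo estimate of B_M / B_max for zero-mean Gaussian work (beta = 1)."""
    rng = np.random.default_rng(seed)
    work = rng.normal(0.0, sigma, size=(n_blocks, M))
    a = -work
    a_max = a.max(axis=1, keepdims=True)
    # Stable log-mean-exp over each block of M measurements.
    log_mean_exp = a_max[:, 0] + np.log(np.mean(np.exp(a - a_max), axis=1))
    block_average = np.mean(-log_mean_exp)
    delta_f = -0.5 * sigma**2
    return (block_average - delta_f) / (0.5 * sigma**2)

def rem_normalized_bias(sigma, M):
    """Limiting normalized bias with sigma_c = sqrt(2 log M)."""
    sigma_c = np.sqrt(2.0 * np.log(M))
    return 0.0 if sigma <= sigma_c else (1.0 - sigma_c / sigma)**2

for sigma in (0.5, 2.0, 5.0, 20.0):
    print(sigma, simulate_normalized_bias(sigma, M=100), rem_normalized_bias(sigma, 100))
```

For small $\sigma$ the simulated bias is close to zero, while for $\sigma$ well above $\sigma_c$ it approaches its maximum; at finite $M$ the simulated values differ somewhat from the limiting curve, since finite-size corrections are not included in the latter.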

B. Ideal gas compression experiment 

Consider an isolated system consisting of $N$ non-interacting particles (ideal gas). The
system is confined in the interval $[-L, L]$ in the $x$ direction. The macroscopic state of the
system is defined by the temperature and the value of an externally applied potential. The
microscopic state of the system is characterized by $n$, the number of particles in region II
($0 < x < L$). Correspondingly, the number of particles in region I ($-L < x < 0$)
is $N - n$.

FIG. 2. Approach of the normalized bias to the curve (25) as $M \to \infty$. The random energy model
corresponds to the dash-dotted curve.

Initially, the external potential is zero and the system is assumed to be in a homogeneous
equilibrium state at temperature $\beta^{-1} = 1$. The probability of having $n$ particles in region
II in this state is

$$p(n; 0) = \frac{1}{2^N} \binom{N}{n}, \qquad n = 0, 1, \ldots, N. \tag{26}$$

This is a binomial distribution with parameters $(N, 1/2)$. It is peaked around the mean
$n^*(0) = N/2$. Its standard deviation is $\sqrt{N}/2$. The corresponding free energy is

$$F_N(0) = -\log \left[ \frac{1}{2^N} \sum_{n=0}^{N} \binom{N}{n} \right] = 0. \tag{27}$$



Consider a compression experiment during which the system undergoes a sudden transition
from the initial homogeneous equilibrium state to a state in which particles are more
likely to be in region I because of the presence of a positive external potential in region II

$$V(x; \epsilon) = \begin{cases} 0, & -L < x < 0 \quad \text{[region I]} \\ \epsilon, & 0 < x < L \quad \text{[region II]}. \end{cases} \tag{28}$$



The equilibrium distribution at temperature $\beta^{-1} = 1$, when the system is subject to this
external potential, is a binomial distribution with parameters $(N, e^{-\epsilon}/(1 + e^{-\epsilon}))$

$$p(n; \epsilon) = \frac{1}{(1 + e^{-\epsilon})^N} \binom{N}{n} e^{-\epsilon n}, \qquad n = 0, 1, \ldots, N. \tag{29}$$

This distribution is peaked around its mean $n^*(\epsilon) = N/(1 + e^{\epsilon})$. Its standard deviation is
$\sqrt{N} e^{-\epsilon/2} / (1 + e^{-\epsilon})$. The corresponding free energy is

$$F_N(\epsilon) = -\log \left[ \frac{1}{2^N} \sum_{n=0}^{N} \binom{N}{n} e^{-\epsilon n} \right] = N \log \frac{2}{1 + e^{-\epsilon}}. \tag{30}$$



The change in free energy between the initial state, in which the system is in equilibrium
at temperature $\beta^{-1} = 1$ in the absence of an external potential, and an equilibrium state at
the same temperature with potential $V(x; \epsilon)$ is

$$\Delta F = F_N(\epsilon) - F_N(0) = N \log \frac{2}{1 + e^{-\epsilon}}. \tag{31}$$

As noted by several authors, this is not the change of free energy between the equilibrium 
states corresponding to the initial and final values of the work parameter in the experiment 
that is being performed [32, 33]. In particular, the configuration of the particles in the 
system at $t = 0^+$ is the same as in the initial state because there has not been any time
to evolve. In most cases, this configuration is very atypical of the equilibrium state for the
new constraints. In fact, the system does not have a well-defined temperature in this state.
Nonetheless, using the equality

$$\Delta F = -\log \langle e^{-W} \rangle, \tag{32}$$

it is possible to compute the change in free energy in an isothermal process (31) in terms of 
the external work performed on the system during the compression [40]. 

In practice, one needs to carry out a series of independent realizations of the experiment. 
In the $m$th realization, the configuration of the system is sampled from the equilibrium
distribution $n_m \sim p(n; 0)$. The external work needed to perform the transition in this
particular realization is $W_m = n_m \epsilon$. The average work over $M$ realizations of the experiment
is

$$\langle W \rangle_M = \frac{1}{M} \sum_{m=1}^{M} W_m = \frac{\epsilon}{M} \sum_{m=1}^{M} n_m. \tag{33}$$

Since the $\{n_m\}_{m=1}^{M}$ are independent identically distributed random variables (iidrvs), $\langle W \rangle_M$
is also a random variable whose average is the thermodynamic work

$$\langle W \rangle = E\left[\langle W \rangle_M\right] = \frac{1}{M} \sum_{m=1}^{M} E[n_m]\, \epsilon = E[n]\, \epsilon = \frac{1}{2} N \epsilon. \tag{34}$$
m=l 
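The average and the typical value of the work performed coincide.
Consider now the Jarzynski estimator of the free-energy difference

$$\Delta F_M = -\log \langle e^{-W} \rangle_M = -\log \left( \frac{1}{M} \sum_{m=1}^{M} e^{-W_m} \right) = -\log \left( \frac{1}{M} \sum_{m=1}^{M} e^{-\epsilon n_m} \right). \tag{35}$$

This is also a random variable whose average is

$$E[\Delta F_M] = -E\left[\log \langle e^{-W} \rangle_M\right] = -E\left[\log \left( \frac{1}{M} \sum_{m=1}^{M} e^{-W_m} \right)\right] = -E\left[\log \left( \frac{1}{M} \sum_{m=1}^{M} e^{-\epsilon n_m} \right)\right]. \tag{36}$$

For $M = 1$ this average coincides with the average work

$$E[\Delta F_1] = \langle W \rangle = \frac{1}{2} N \epsilon. \tag{37}$$

In the limit $M \to \infty$, it converges to the free energy change

$$\lim_{M \to \infty} E[\Delta F_M] = \Delta F = N \log \frac{2}{1 + e^{-\epsilon}}. \tag{38}$$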
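The two limiting cases of the block average for the ideal gas compression can be illustrated by direct simulation. In the sketch below the function names, the seed, and the system parameters are hypothetical choices: the number of particles in region II is drawn from a binomial distribution with parameters $(N, 1/2)$, the work in each realization is $\epsilon n_m$, and the Jarzynski estimate is averaged over independent blocks of $M$ realizations.

```python
import numpy as np

def ideal_gas_block_average(N, eps, M, n_blocks=2000, seed=0):
    """Monte Carlo block average E[dF_M] for the sudden compression (beta = 1)."""
    rng = np.random.default_rng(seed)
    n = rng.binomial(N, 0.5, size=(n_blocks, M))   # particles in region II
    a = -eps * n                                   # exponents -W_m
    a_max = a.max(axis=1, keepdims=True)
    log_mean_exp = a_max[:, 0] + np.log(np.mean(np.exp(a - a_max), axis=1))
    return float(np.mean(-log_mean_exp))

def exact_delta_f(N, eps):
    """Free energy change N log(2 / (1 + exp(-eps)))."""
    return N * np.log(2.0 / (1.0 + np.exp(-eps)))

N, eps = 20, 1.0
print("average work      :", 0.5 * N * eps)
print("block avg, M=1    :", ideal_gas_block_average(N, eps, M=1))
print("block avg, M=1000 :", ideal_gas_block_average(N, eps, M=1000, n_blocks=200))
print("free energy change:", exact_delta_f(N, eps))
```

For this small system the block average with a single measurement reproduces the average work, while a thousand measurements per block bring it close to the free energy change.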



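There is no closed-form expression for this average for other values of $M$. However,
taking advantage of the correspondence between the ideal gas compression experiment and
the discrete random energy model (see appendix C), it is possible to show that there is a
continuous but abrupt change in the block average

$$E[\Delta F_M] \to \begin{cases} N \log \frac{2}{1 + e^{-\epsilon}}, & \epsilon < \epsilon_c(\gamma) \\ N \left[ \gamma \log 2 + \frac{\epsilon}{2} \left( 1 - \tanh \frac{\epsilon_c}{2} \right) \right] = N \left[ \gamma \log 2 + \frac{\epsilon}{1 + e^{\epsilon_c}} \right], & \epsilon > \epsilon_c(\gamma) \end{cases} \tag{39}$$

in the limit $M \to \infty$, $N \to \infty$, with

$$\gamma = \frac{\log M}{N \log 2} \to \text{constant}, \tag{40}$$

$$\epsilon_c(\gamma) = \begin{cases} \infty, & \gamma \ge 1 \\ \log\left[1 - h_2^{-1}(1 - \gamma)\right] - \log\left[h_2^{-1}(1 - \gamma)\right], & \gamma < 1. \end{cases} \tag{41}$$

The function $h_2^{-1}(y) \in [0, 1/2]$ is the inverse of the binary entropy

$$h_2(x) = -x \log_2 x - (1 - x) \log_2 (1 - x). \tag{42}$$

In this limit the bias of $E[\Delta F_M]$ as an estimator of $\Delta F$ is

$$B_M = E[\Delta F_M] - \Delta F \to \begin{cases} 0, & \epsilon < \epsilon_c(\gamma) \\ N \left[ \gamma \log 2 + \frac{\epsilon}{2} \left( 1 - \tanh \frac{\epsilon_c}{2} \right) - \log \frac{2}{1 + e^{-\epsilon}} \right], & \epsilon > \epsilon_c(\gamma). \end{cases} \tag{43}$$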
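For a fixed value of $\epsilon$ the bias is maximum for $M = 1$ ($\gamma = 0$, $\epsilon_c = 0$)

$$B_{max} = N \left[ \frac{\epsilon}{2} - \log \frac{2}{1 + e^{-\epsilon}} \right] = N \log \cosh \frac{\epsilon}{2}. \tag{44}$$

The normalized bias is

$$\tilde{B}(\gamma, \epsilon) = \frac{B_M}{B_{max}} = \begin{cases} 0, & \gamma > \gamma_c(\epsilon) \\ 1 - \dfrac{\frac{\epsilon}{2} \tanh \frac{\epsilon_c}{2} - \gamma \log 2}{\log \cosh \frac{\epsilon}{2}} = 1 - \dfrac{\epsilon - \epsilon_c}{2} \dfrac{\tanh \frac{\epsilon_c}{2}}{\log \cosh \frac{\epsilon}{2}} - \dfrac{\log \cosh \frac{\epsilon_c}{2}}{\log \cosh \frac{\epsilon}{2}}, & \gamma < \gamma_c(\epsilon) \end{cases} \tag{45}$$

where

$$\gamma_c(\epsilon) = \epsilon_c^{-1}(\epsilon) = 1 - h_2\left( \frac{1}{1 + e^{\epsilon}} \right), \tag{46}$$

with $\gamma_c(0) = 0$ and $\lim_{\epsilon \to \infty} \gamma_c(\epsilon) = 1$.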

The results of computer simulations of the ideal gas compression experiment at temperature
$\beta^{-1} = 1$ are depicted in figure 3. The graphs display, for different values of $\epsilon$, the
dependence of the normalized bias on system size, measured in terms of $\gamma^{-1}$
with $M$ fixed, for different values of $M$. The behavior observed is similar to the Gaussian
case: the regime in which the Jarzynski equality can be realized (i.e. the Jarzynski free
energy estimator is unbiased) corresponds to gases composed of a small number of particles.
If the system size is increased, assuming that the number of measurements is kept fixed, a
gradual change takes place to a regime in which the Jarzynski estimator becomes biased.
Asymptotically, for large $N$, the estimator becomes maximally biased, which means that
only typical work values are observed. Linear increases in $M$ do not significantly reduce
this bias. Therefore, in this regime, it is not possible to empirically realize the Jarzynski
equality and measurements yield standard macroscopic thermodynamic values. The change
in regime becomes more abrupt as $\epsilon$, $M$ and $N$ increase and is asymptotically well described
by the phase transition that takes place in the discrete random energy model.

Table II displays the number of particles ($N_c$) at which the transition takes place from a
regime in which the Jarzynski estimator is unbiased ($N < N_c$) to a regime in which it is
biased ($N > N_c$), for several values of $M$ and $\epsilon$. The results presented in this table
show that the critical system size is rather small and increases rather
slowly (logarithmically) with $M$, the number of work measurements performed.
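
The critical sizes discussed above can be checked numerically with a short sketch. It assumes the relation $\gamma = \log M / (N \log 2)$ from Appendix C together with Eqs. (45) and (46). The function names and the choice of rounding $N_c$ up to the next integer are illustrative assumptions, not taken from the original text.

```python
import math


def binary_entropy(x):
    """Binary entropy h2(x) in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def inverse_binary_entropy(y, iterations=100):
    """Inverse of h2 on [0, 1/2], by bisection (h2 is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_c(eps):
    """Critical ratio gamma_c(eps) = 1 - h2(1 / (1 + e^eps)), Eq. (46)."""
    return 1.0 - binary_entropy(1.0 / (1.0 + math.exp(eps)))


def critical_size(num_measurements, eps):
    """N_c = log2(M) / gamma_c(eps), rounded up (rounding is an assumption)."""
    return math.ceil(math.log2(num_measurements) / gamma_c(eps))


def normalized_bias(n_particles, num_measurements, eps):
    """Asymptotic normalized bias B_M / B_max of Eq. (45)."""
    gamma = math.log2(num_measurements) / n_particles
    if gamma >= gamma_c(eps):
        return 0.0
    h = inverse_binary_entropy(1.0 - gamma)
    # tanh(eps_c / 2) = 1 - 2h when eps_c = log((1 - h) / h)
    tanh_half_eps_c = 1.0 - 2.0 * h
    numerator = 0.5 * eps * tanh_half_eps_c - gamma * math.log(2.0)
    return 1.0 - numerator / math.log(math.cosh(0.5 * eps))


for m in (10, 100, 1000, 10000):
    print(m, [critical_size(m, eps) for eps in (0.5, 1.0, 2.0, 5.0)])
```

With these conventions, each printed row can be compared directly with the corresponding row of Table II.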




C. Adiabatic quasi-static volume change in a dilute gas. 

Consider a dilute gas of $N$ interacting particles in $d$ dimensions. Assume that this gas
undergoes an adiabatic quasi-static volume change from $V_0$ to $V_1$. The work distribution in
this process is

\[
p(W) = \frac{1}{|\alpha| \Gamma(K)} \left( \frac{W}{\alpha} \right)^{K-1} e^{-W/\alpha} \, \theta(\alpha W), \tag{47}
\]

where $K = Nd/2$ and $\alpha = (V_0/V_1)^{2/d} - 1$ [41]. The Heaviside step function $\theta(\alpha W)$ guarantees
that work is positive for compression ($\alpha > 0$) and negative for expansion ($\alpha < 0$). The
average work is $\langle W \rangle = (Nd/2)\alpha$. The typical value can be identified with the mode of
the distribution, $W_{typ} = (Nd/2 - 1)\alpha$ for $Nd/2 > 1$. Note that for large systems ($N \to \infty$)
typical and average work values are very similar.

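Assuming a fixed number of measurements $M$ there is an abrupt change of behavior of
the Jarzynski estimator as $N$, the size of the system, increases, for sufficiently large $M$ and
$N$ (see Appendix B)

\[
E[\Delta F_M] = -E\left[ \log \langle e^{-W} \rangle_M \right] = \langle W \rangle + \log M - E\left[ \log \sum_{m=1}^{M} e^{-\alpha E_m} \right]
= \begin{cases}
N \frac{d}{2} \log(1 + \alpha), & N < N_c \\
N \frac{d}{2} \alpha \left( 1 + x_l(N) \right) + \log M, & N > N_c
\end{cases} \tag{48}
\]

where the system size at which the transition takes place is

\[
N_c = \frac{2 \log M}{d \left( \log(1 + \alpha) - \frac{\alpha}{1 + \alpha} \right)} \tag{49}
\]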



TABLE II. Number of particles ($N_c$) at which the transition between the two regimes of the
Jarzynski estimator occurs for different values of $M$, the number of work measurements performed,
and $\epsilon$, the value of the external potential applied in the compression of an ideal gas.

               $\epsilon = 0.5$   $\epsilon = 1$   $\epsilon = 2$   $\epsilon = 5$
$M = 10$              76               21               8                4
$M = 100$            152               42              15                8
$M = 1000$           228               63              22               11
$M = 10,000$         304               84              29               15
FIG. 3. Dependence of the normalized bias on $N$ for $\epsilon = 0.5$ (top left), $\epsilon = 1$ (top right), $\epsilon = 2$
(bottom left) and $\epsilon = 5$ (bottom right). The asymptotic value for the discrete random energy
model corresponds to the dash-dotted curve in the plots.



and $x_l(N)$ is the negative solution of

\[
x_l(N) - \log(1 + x_l(N)) = \frac{2 \log M}{N d}, \qquad -1 < x_l(N) < 0. \tag{50}
\]

This nonlinear equation has another solution at $x_u(N) > 0$. At the transition point

\[
x_l(N_c) = -\frac{\alpha}{1 + \alpha}. \tag{51}
\]



The free energy difference for an isothermal volume change computed from the values of
work measured in the adiabatic process is

\[
\Delta F = \lim_{M \to \infty} E[\Delta F_M] = N \frac{d}{2} \log(1 + \alpha) = N \log \frac{V_0}{V_1}, \tag{52}
\]

in units of $\beta^{-1} = k_B T$.

The bias in the Jarzynski estimator of the free energy difference is

\[
B_M = \begin{cases}
0, & N < N_c \\
N \frac{d}{2} \left( \alpha (1 + x_l) - \log(1 + \alpha) \right) + \log M, & N > N_c
\end{cases} \tag{53}
\]
FIG. 4. Dependence of the normalized bias on $\log N / \log N_c$ for $M = 10$ (top left), $M = 100$ (top right),
$M = 1000$ (bottom left) and $M = 100,000$ (bottom right). The dash-dotted curve corresponds to
the discrete random energy model and gives the limiting behavior as $M \to \infty$.

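The maximum value of the bias corresponds to $M = 1$, which implies $x_l = x_u = 0$, $N_c = 0$,
and

\[
B_{max} = B_1 = \langle W \rangle - \Delta F = N \frac{d}{2} \left( \alpha - \log(1 + \alpha) \right). \tag{54}
\]

Therefore, the bias normalized by its maximum value is

\[
\bar{B} = \frac{B_M}{B_{max}} = \begin{cases}
0, & N < N_c \\
1 + \dfrac{x_l (1 + \alpha) - \log(1 + x_l)}{\alpha - \log(1 + \alpha)}, & N > N_c
\end{cases} \tag{55}
\]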
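
As a rough numerical illustration, Eqs. (49), (50) and (55) can be evaluated with a few lines of code. The sketch below assumes that the right-hand side of Eq. (50) is $2 \log M / (N d)$, as follows from Appendix B, and solves for $x_l$ by bisection. All function names are hypothetical.

```python
import math


def compression_parameter(v0_over_v1, d=3):
    """alpha = (V0 / V1)^(2/d) - 1 for an adiabatic quasi-static volume change."""
    return v0_over_v1 ** (2.0 / d) - 1.0


def critical_size(num_measurements, a, d=3):
    """N_c of Eq. (49), returned as a real number."""
    return 2.0 * math.log(num_measurements) / (d * (math.log1p(a) - a / (1.0 + a)))


def x_lower(c, iterations=200):
    """Negative root of x - log(1 + x) = c (c >= 0), by bisection on (-1, 0)."""
    lo, hi = -1.0 + 1e-15, 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - math.log1p(mid) - c > 0.0:
            lo = mid  # left-hand side still too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def normalized_bias(n_particles, num_measurements, a, d=3):
    """Asymptotic normalized bias of Eq. (55)."""
    if n_particles <= critical_size(num_measurements, a, d):
        return 0.0
    x = x_lower(2.0 * math.log(num_measurements) / (n_particles * d))
    return 1.0 + (x * (1.0 + a) - math.log1p(x)) / (a - math.log1p(a))


a = compression_parameter(4.0)  # V1 = V0 / 4 in d = 3 dimensions
print(critical_size(100000, a))
print([round(normalized_bias(n, 100000, a), 3) for n in (10, 30, 100, 1000)])
```

Because the transition is continuous, the value returned just above $N_c$ is close to zero, and it approaches one as $N$ grows with $M$ fixed.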

The results obtained in simulations of the adiabatic compression of a dilute real gas of
particles in $d = 3$ dimensions from a volume $V_0$ to a volume $V_1 = V_0/4$ are shown in
Figure 4. The plots display the normalized bias as a function of $\log N / \log N_c$ for different
values of $M$, the number of measurements. In contrast with the Gaussian case, the curve
for the random energy model changes, albeit slightly, as a function of $M$. The remaining
curves in the plots correspond to averages over Monte Carlo simulations. The simulations
cannot be performed for large values of $N$ because of numerical overflows. For fixed values
of $M$, one observes a transition from a regime in which the bias is close to zero, for small
systems, to a regime in which the Jarzynski estimator is biased, when the system size is above
a threshold $N_c$. The value of $N_c$ has a logarithmic dependence on $M$, which means that
the change in regime occurs for fairly small systems, even when the number of experiments
is rather large ($N_c \approx 24$ for $M = 100,000$). Beyond this threshold, the bias increases
monotonically and asymptotically approaches its maximum. The regime in which the bias
is maximal corresponds to the realm of classical thermodynamics. Within this regime the
measured work can be identified with either the average ($(Nd/2)\alpha$) or the typical value
($(Nd/2 - 1)\alpha$). Typical and average values of work, as well as the Jarzynski estimator in
a finite number of measurements, become indistinguishable as the number of gas particles
increases.
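
A Monte Carlo experiment of this kind can be sketched as follows. This is not the simulation code used for Figure 4. It is a minimal illustration that assumes work values are drawn as $\alpha$ times a gamma variate of shape $K = Nd/2$ (units of $\beta^{-1}$), and that shifts the exponent by the smallest work value before exponentiating, which is one common way to reduce overflow and underflow problems. Function names, repetition counts and the seed are arbitrary choices.

```python
import math
import random


def sample_work(n_particles, a, d, rng):
    """Draw one work value: W / a follows a gamma law with shape K = N d / 2."""
    k = 0.5 * n_particles * d
    return a * rng.gammavariate(k, 1.0)


def jarzynski_estimate(works):
    """-log of the empirical mean of exp(-W), shifted by min(W) for stability."""
    w_min = min(works)
    total = sum(math.exp(-(w - w_min)) for w in works)
    return w_min - math.log(total / len(works))


def mean_estimate(n_particles, num_measurements, a, d=3, repetitions=200, seed=0):
    """Average of the Jarzynski estimator over independent batches of M work values."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(repetitions):
        works = [sample_work(n_particles, a, d, rng) for _ in range(num_measurements)]
        acc += jarzynski_estimate(works)
    return acc / repetitions


a = 4.0 ** (2.0 / 3.0) - 1.0  # V1 = V0 / 4, d = 3
for n in (2, 10, 50, 200):
    delta_f = 0.5 * n * 3 * math.log1p(a)  # isothermal free energy difference
    print(n, mean_estimate(n, 1000, a) - delta_f)  # empirical bias for M = 1000
```

For small $N$ the printed bias is close to zero, while for large $N$ it grows toward $\langle W \rangle - \Delta F$.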

The change between these two regimes, which is smooth for small M, becomes more 
abrupt as M increases. The variant of the REM model analyzed in Appendix B provides a 
good qualitative description of the transition that becomes more accurate as M, N and the 
volume change become larger. 

V. DISCUSSION AND CONCLUSIONS 

There are two paradigms in which statistical mechanics has successfully bridged mechan- 
ical and thermodynamic laws in systems composed of large numbers of particles using a 
probabilistic description. On the one hand, individual macroscopic systems can be characterized
by the values of thermodynamic quantities and their fluctuations. In these large
systems the distribution of each of the thermodynamic quantities is sharply peaked around
its mode. Therefore, the fluctuations are small and need to be probed by means of special
experimental techniques (e.g. light scattering for mass density fluctuations) or in indirect
measurements (e.g. using fluctuation-dissipation relations to analyze the relaxation of systems
removed from equilibrium). The most probable value obtained in a single experiment
agrees with the average over several experiments. On the other hand, in a small system
coupled to a large number of degrees of freedom the standard thermodynamic description
is valid for the system plus environment considered as a whole. However, measurements
of reduced properties associated with a small number of degrees of freedom are dominated 
by fluctuations. In consequence, thermodynamic relations cannot be used to describe indi- 
vidual measurements. Notwithstanding, they can be understood in terms of averages and 
fluctuations over repeated measurements. The coupling of the small system to the environ- 
ment provides a mechanism for the reduced properties to approach their equilibrium values 
as given by the corresponding ensemble average [42-45]. Each measurement can be un- 
derstood as sampling the initial state from the equilibrium distribution for the appropriate 
constraints (e.g. temperature, if the environment acts as a thermal reservoir). For stan- 
dard thermodynamic quantities, these measurements over ensembles of small systems yield 
typical and average values that are close to each other. In contrast, typical values obtained 
in individual measurements of work in nonequilibrium processes do not fulfill the Jarzynski 
equality. In this equality, the average of the exponential of minus the work is computed by 
taking the mean of repeated independent measurements in systems prepared at equilibrium. 
By the law of large numbers, this Monte Carlo estimate converges to the expected value in 
the limit of an infinite number of measurements. However, the sum is dominated by rare 
events that need to be sufficiently well sampled so that the Monte Carlo estimate is close to 
the asymptotic result. If the number of samples is below a certain threshold, the estimator 
is biased. It becomes unbiased only for sufficiently large sample sizes. In the appropriate 
scaling limit, the change of behavior corresponds to a phase transition in variants of the 
random energy model, a simplified model for spin glasses. 

This phenomenon can be analyzed from an alternative point of view. Assuming that the 
resources available are limited and that the number of measurements M is fixed, the change 
of behavior in the estimator appears as the size of the system is increased. For small systems, 
the thermodynamic description needs to be understood in terms of ensemble averages, not 
of individual measurements. In this regime, the Jarzynski estimator is unbiased. It is equal 
to the exponential of minus the free energy difference in an isothermal evolution between the 
initial and the final values of the work parameter, independently of other characteristics of 
the actual nonequilibrium process that takes place in the system. In particular, it does not 
depend on the final state of the system after the external manipulation is completed. The 
lack of sensitivity to the details of the manipulation is a striking reflection at the microscopic
level of the Hamiltonian evolution of the system as a whole (including the degrees of freedom
of the environment) as demonstrated in [32]. The mechanical character of the equality is
also highlighted by the fact that the largest contributions to the average correspond to
initial configurations of the system that are typical of the reverse process [18]. Since these 
configurations need to be sufficiently well sampled, this implies that, under these conditions, 
the arrow of time is only tenuously defined. 

For large systems and fixed M, the Jarzynski estimator becomes biased. In fact, as the 
size of the system approaches infinity, the average of the exponential of minus the work val- 
ues obtained in the nonequilibrium process approaches the exponential of minus the average 
work. Since the typical and the average are close to each other, a single experiment and 
an average over a number of experiments of the order of M yield results that are equiva- 
lent within measurement errors. Furthermore, this average depends on the details of the 
external manipulation. These features are in agreement with the standard thermodynamic 
description of an individual macroscopic system: the dominant configurations, which are 
rare for the initial state but typical of the final equilibrium state, are never sampled. The 
sample mean is dominated by typical configurations, which are characteristic of the forward 
process. Therefore, in this regime, the arrow of time becomes well defined. 

The necessary condition to observe a transition in the Jarzynski estimator is that $\langle W \rangle$,
the average work in the nonequilibrium process, be different from $\Delta F$, the isothermal
free-energy change. If the nonequilibrium process is adiabatic, these quantities are different even
when the external manipulation of the system is slow. This can be seen in section IV C,
which analyzes a quasi-static adiabatic volume change in a dilute gas. Another example is
the adiabatic expansion of an ideal gas against a piston: the maximum bias $B_{max} = \langle W \rangle - \Delta F$
is different from zero even when the velocity of the piston approaches zero (see Figure 4
in [33]). The situation is different for isothermal processes. In this case $\langle W \rangle$ approaches
$\Delta F$ in the limit of quasi-static driving. Therefore, in an isothermal process, one should
expect a transition from a fast driving regime, in which the Jarzynski estimator of the free 
energy difference is biased, to a slow driving regime, in which the Jarzynski estimator is 
unbiased. Isothermal processes are currently under analysis. 

The conclusions of this study are, of course, not new. The emergence of irreversibility 
and of a thermodynamic description in systems with many degrees of freedom is one of 
the central results of statistical mechanics [1-4, 46]. The analysis carried out shows how, 
in the context of the Jarzynski equality, the emergence of this macroscopic picture can be 
understood in terms of a phase transition that appears in the proper scaling limit and when
the measurement resources are limited as the size of the system increases.



Appendix A: The random energy model 

The random energy model (REM) was introduced in [20, 21] as a simplified model for spin
glasses that captures many salient properties of these types of disordered systems. In the
random energy model, the system has $M = 2^K$ energy levels, $\{E_i\}_{i=1}^{M}$. These energy levels
are independent identically distributed random variables sampled from a normal distribution

\[
p(E) = \frac{1}{\sqrt{\pi K}} e^{-E^2/K}. \tag{A1}
\]

The canonical partition function for a particular system (i.e. for a particular realization of
the $M$ energy levels) at temperature $\beta^{-1}$ is

\[
Z_M(\beta) = \sum_{i=1}^{M} e^{-\beta E_i}. \tag{A2}
\]

In the limit $K \to \infty$, the system undergoes a second order phase transition

\[
\lim_{K \to \infty} \frac{1}{K} E[\log Z_M(\beta)] = \begin{cases}
\frac{\beta^2}{4} + \log 2, & \beta < \beta_c \\
\beta \sqrt{\log 2}, & \beta > \beta_c
\end{cases} \tag{A3}
\]

where $\beta_c = 2\sqrt{\log 2}$.

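To make the connection between the Gaussian REM and the empirical estimation of the
free energy in a process in which the work distribution is Gaussian, $W \sim N(\langle W \rangle, \sigma^2)$, we
make the change of variables

\[
W_m = \langle W \rangle + \sqrt{\frac{2 \log 2}{\log M}} \, \sigma E_m, \tag{A4}
\]

where we have used the fact that $K = \log M / \log 2$. In terms of these new variables

\[
\langle e^{-W} \rangle_M = \frac{1}{M} \sum_{m=1}^{M} e^{-W_m} = \frac{e^{-\langle W \rangle}}{M} \sum_{m=1}^{M} e^{-\sqrt{\frac{2 \log 2}{\log M}} \sigma E_m}. \tag{A5}
\]

Identifying $\sqrt{2 \log 2 / \log M} \, \sigma$ with $\beta$ in the REM, we obtain

\[
E[\Delta F_M] = -E\left[ \log \langle e^{-W} \rangle_M \right] = \langle W \rangle + \log M - E\left[ \log \sum_{m=1}^{M} e^{-\sqrt{\frac{2 \log 2}{\log M}} \sigma E_m} \right]
= \begin{cases}
\langle W \rangle - \frac{\sigma^2}{2}, & M > \exp\left\{ \frac{\sigma^2}{2} \right\} \\
\langle W \rangle - \sqrt{2} \, \sigma \sqrt{\log M} + \log M, & M < \exp\left\{ \frac{\sigma^2}{2} \right\}
\end{cases} \tag{A6}
\]

for $M \to \infty$ and $\sigma^2 \to \infty$ with $\log M / \sigma^2$ constant.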
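
The two branches of this asymptotic result are simple enough to evaluate directly. The sketch below encodes them, assuming the threshold $M = \exp\{\sigma^2/2\}$, and also returns the bias normalized by its maximum $\sigma^2/2$, which simplifies algebraically to $(1 - \sqrt{2 \log M}/\sigma)^2$ in the biased regime. The function names are illustrative.

```python
import math


def gaussian_estimator_mean(mean_work, sigma, num_measurements):
    """Asymptotic E[Delta F_M] for Gaussian work, in units of 1 / beta."""
    log_m = math.log(num_measurements)
    if log_m > 0.5 * sigma ** 2:
        # Enough samples: the estimator recovers <W> - sigma^2 / 2
        return mean_work - 0.5 * sigma ** 2
    # Too few samples: the estimate is dominated by the smallest sampled work
    return mean_work - math.sqrt(2.0 * log_m) * sigma + log_m


def gaussian_normalized_bias(sigma, num_measurements):
    """Bias divided by its maximum sigma^2 / 2 (reached for M = 1)."""
    log_m = math.log(num_measurements)
    if log_m > 0.5 * sigma ** 2:
        return 0.0
    return (1.0 - math.sqrt(2.0 * log_m) / sigma) ** 2


for m in (1, 10, 1000, 10**6):
    print(m, gaussian_estimator_mean(10.0, 4.0, m), gaussian_normalized_bias(4.0, m))
```

Both branches agree at the threshold, so the expected estimate is continuous in $M$ even though its behavior changes.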



Appendix B: The random energy model with energy levels that follow a gamma 
distribution 

Consider a system with $M$ random energy levels, $\{E_i\}_{i=1}^{M}$, independently sampled from
a gamma distribution

\[
p(E) = \frac{(K + E)^{K-1}}{\Gamma(K)} e^{-(E + K)}, \qquad E \in [-K, \infty). \tag{B1}
\]

Define the parameter $\xi = \frac{\log M}{K \log 2}$. In terms of this parameter $M = 2^{K\xi}$. It is possible
to derive an expression for the entropy in the limit $K \to \infty$ by analyzing the behavior of
$\mathcal{N}(e, e + \delta)$, the number of energy levels in an interval $I(e; \delta) = [Ke, K(e + \delta)]$, with $e > -1$,
$\delta > 0$ [38]. Since this count depends on the realization of the system, $\mathcal{N}(e, e + \delta)$ is a binomial
random variable whose first two moments are

\[
E[\mathcal{N}(e, e + \delta)] = 2^{K\xi} P_I(e; \delta) \tag{B2}
\]
\[
\mathrm{Var}[\mathcal{N}(e, e + \delta)] = 2^{K\xi} P_I(e; \delta) \left( 1 - P_I(e; \delta) \right), \tag{B3}
\]

where

\[
P_I(e; \delta) = \int_{Ke}^{K(e + \delta)} p(E) \, dE \tag{B4}
\]

is the probability of an individual energy level to be in the interval $I(e; \delta)$. As $K \to \infty$ these
moments can be approximated to leading exponential order as

\[
E[\mathcal{N}(e, e + \delta)] \doteq \exp\left\{ K \max_{x \in [e, e + \delta]} s_a(x) \right\} \tag{B5}
\]

with

\[
s_a(x) = \xi \log 2 + \log(1 + x) - x. \tag{B7}
\]
Note that $s_a(x) > 0$ for $x_l < x < x_u$, where $-1 < x_l < 0 < x_u$ fulfill $s_a(x_l) = s_a(x_u) = 0$.
The limiting behavior of this equation when $\xi \to 0$ is

\[
s_a(x) \approx \xi \log 2 - x^2/2, \qquad x_l < x < x_u, \tag{B8}
\]

where $x_l \approx -\sqrt{2 \xi \log 2}$ and $x_u \approx \sqrt{2 \xi \log 2}$. In this limit the results are similar to the
Gaussian REM. In the opposite limit, when $\xi \to \infty$, it is

\[
s_a(x) \approx \xi \log 2 + \log(1 + x), \qquad x \approx x_l, \tag{B9}
\]

where $x_l \approx -1 + 2^{-\xi}$.

The entropy function is defined as

\[
s(x) = \begin{cases}
s_a(x) = \xi \log 2 + \log(1 + x) - x, & x_l < x < x_u \\
-\infty, & x < x_l, \; x > x_u.
\end{cases} \tag{B10}
\]

It can be shown that for any pair $e$, $\delta$, with probability one,

\[
\lim_{K \to \infty} \frac{1}{K} \log \mathcal{N}(e, e + \delta) = \sup_{x \in [e, e + \delta]} s(x). \tag{B11}
\]

The canonical partition function for a particular realization of the $M$ energy levels at
temperature $\beta^{-1}$ is

\[
Z_M(\beta) = \sum_{i=1}^{M} e^{-\beta E_i}. \tag{B12}
\]

In the limit $K \to \infty$,

\[
Z_M(\beta) \doteq \int_{x_l}^{x_u} \exp\left[ K \left( s_a(x) - \beta x \right) \right] dx \doteq \exp\left\{ K \max_{x \in [x_l, x_u]} \left( s_a(x) - \beta x \right) \right\} \tag{B13}
\]

to exponential accuracy. Depending on the temperature, the maximum is either in the
interval $(x_l, 0)$ (high temperature) or at $x_l$ (low temperature)

\[
\arg\max_{x \in [x_l, x_u]} \left( s_a(x) - \beta x \right) = \begin{cases}
-\frac{\beta}{1 + \beta}, & \beta < \beta_c \\
x_l, & \beta > \beta_c
\end{cases} \tag{B14}
\]

where $\beta_c = -x_l / (1 + x_l)$. A graphical construction that illustrates this transition is presented
in figure 5 for $\xi = 1$. The curves displayed are $s_a(x)$ and straight lines with slope $\beta$ that are
tangent to $s_a(x)$ at the points that are local maxima of $s_a(x) - \beta x$. For high temperatures
($\beta < \beta_c$) the local maximum is within the interval $[x_l, x_u]$ and is therefore the solution of
(B14). At low temperatures ($\beta > \beta_c$), the solution to (B14) is $x_l$, which, in this regime, is
a global maximum, but not a local one.

FIG. 5. Transition between the low and high temperature regimes in the random energy model
with energy levels sampled from a Gamma distribution (see text for details).

Using these relations it can be shown that, in this
limit, the system undergoes a second order phase transition

\[
\lim_{K \to \infty} \frac{1}{K} E[\log Z_M(\beta)] = \begin{cases}
\xi \log 2 + \beta - \log(1 + \beta), & \beta < \beta_c \\
-\beta x_l, & \beta > \beta_c.
\end{cases} \tag{B15}
\]

Continuity at $\beta_c$ implies that

\[
\log(1 + \beta_c) - \frac{\beta_c}{1 + \beta_c} = \xi \log 2. \tag{B16}
\]
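Consider now a sample of $M = 2^{K\xi}$ work values $\{W_m\}_{m=1}^{M}$ from the distribution

\[
p(W) = \frac{1}{|\alpha| \Gamma(K)} \left( \frac{W}{\alpha} \right)^{K-1} e^{-W/\alpha} \, \theta(\alpha W), \tag{B17}
\]

with $K = Nd/2$. The connection with the variant of REM analyzed in this section is
achieved by making the change of variable

\[
W_m = \langle W \rangle + \alpha E_m, \tag{B18}
\]

where $\langle W \rangle = N \frac{d}{2} \alpha$ is the average work and $E_m$ are independent identically distributed
random variables sampled from the distribution (B1). In terms of these new variables

\[
\langle e^{-W} \rangle_M = \frac{1}{M} \sum_{m=1}^{M} e^{-W_m} = \frac{e^{-\langle W \rangle}}{M} \sum_{m=1}^{M} e^{-\alpha E_m}. \tag{B19}
\]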

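Identifying $\alpha$ with $\beta$ in the random energy model, we conclude that there is an abrupt
change of behavior of the Jarzynski estimator of the free energy differences for $M \to \infty$,
$N \to \infty$ with $\log M / N$ constant, as a function of $\xi = \frac{2 \log M}{N d \log 2}$

\[
E[\Delta F_M] = -E\left[ \log \langle e^{-W} \rangle_M \right] = \langle W \rangle + \log M - E\left[ \log \sum_{m=1}^{M} e^{-\alpha E_m} \right]
= \begin{cases}
N \frac{d}{2} \log(1 + \alpha), & \xi > \xi_c \\
N \frac{d}{2} \alpha \left( 1 + x_l(\xi) \right) + \log M, & \xi < \xi_c
\end{cases} \tag{B20}
\]

where

\[
\xi_c = \frac{1}{\log 2} \left[ \log(1 + \alpha) - \frac{\alpha}{1 + \alpha} \right] \tag{B21}
\]

and $x_l(\xi)$ is the negative solution of the nonlinear equation

\[
x_l(\xi) - \log(1 + x_l(\xi)) = \xi \log 2, \qquad -1 < x_l(\xi) < 0. \tag{B22}
\]

At the transition point $\xi = \xi_c$

\[
x_l(\xi_c) = -\frac{\alpha}{1 + \alpha}. \tag{B23}
\]

For a fixed number of particles $N$, the transition takes place when the number of measurements
is above the threshold

\[
M_c = \exp\left\{ N \frac{d}{2} \left[ \log(1 + \alpha) - \frac{\alpha}{1 + \alpha} \right] \right\}. \tag{B24}
\]

Alternatively, for fixed number of experiments $M$, the transition occurs when the number
of particles in the gas is below

\[
N_c = \frac{2 \log M}{d \left[ \log(1 + \alpha) - \frac{\alpha}{1 + \alpha} \right]}. \tag{B25}
\]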
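Appendix C: The discrete random energy model

Consider a system with $M = 2^K$ energy levels, $\{E_i\}_{i=1}^{M}$. These energy levels are independent
identically distributed random variables sampled from a binomial distribution

\[
p(E) = \frac{1}{2^N} \binom{N}{\frac{1}{2} N + E}, \qquad E = -\frac{N}{2}, -\frac{N}{2} + 1, \ldots, \frac{N}{2}. \tag{C1}
\]

The canonical partition function at temperature $\beta^{-1}$ is

\[
Z_M(\beta) = \sum_{i=1}^{M} e^{-\beta E_i}. \tag{C2}
\]

In the limit

\[
M \to \infty, \quad N \to \infty, \quad \gamma = \frac{K}{N} = \frac{\log M}{N \log 2} \to \text{constant} \tag{C3}
\]

the system undergoes a second order phase transition [24]

\[
E\left[ \log \sum_{i=1}^{M} e^{-\beta E_i} \right] = \begin{cases}
K \log 2 + N \log \cosh \frac{\beta}{2}, & \beta < \beta_c \\
\frac{1}{2} \beta N \tanh \frac{\beta_c}{2}, & \beta > \beta_c
\end{cases} \tag{C4}
\]

where

\[
\beta_c = \begin{cases}
\infty, & \gamma > 1 \\
\log\left[ 1 - h_2^{-1}(1 - \gamma) \right] - \log\left[ h_2^{-1}(1 - \gamma) \right], & \gamma < 1
\end{cases} \tag{C5}
\]

and the function $h_2^{-1}(y) \in [0, 1/2]$ is the inverse of the binary entropy

\[
h_2(x) = -x \log_2 x - (1 - x) \log_2 (1 - x). \tag{C6}
\]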

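To make the connection with the ideal gas compression experiment in the calculation of

\[
E[\Delta F_M] = -E\left[ \log \langle e^{-W} \rangle_M \right] = -E\left[ \log \left( \frac{1}{M} \sum_{m=1}^{M} e^{-\epsilon n_m} \right) \right] \tag{C7}
\]

where $n_m \in \{0, 1, \ldots, N\}$ follows a binomial distribution of parameters $N$, $p(0) = 1/2$, we
make the change of variable

\[
n_m = \frac{1}{2} N + E_m, \qquad E_m \in \left\{ -\frac{N}{2}, -\frac{N}{2} + 1, \ldots, \frac{N}{2} \right\}. \tag{C8}
\]

Using these new random variables

\[
E[\Delta F_M] = -E\left[ \log \left( e^{-\epsilon N / 2} \frac{1}{M} \sum_{m=1}^{M} e^{-\epsilon E_m} \right) \right] = N \frac{\epsilon}{2} + \log M - E\left[ \log \sum_{m=1}^{M} e^{-\epsilon E_m} \right]. \tag{C9}
\]

Finally, identifying $\epsilon$ with $\beta$ in the DREM, and $\epsilon_c$ with $\beta_c$,

\[
E[\Delta F_M] = \begin{cases}
N \left[ \frac{\epsilon}{2} - \log \cosh \frac{\epsilon}{2} \right] = N \log \frac{2}{1 + e^{-\epsilon}}, & \epsilon < \epsilon_c \\
N \left[ \gamma \log 2 + \frac{\epsilon}{2} \left( 1 - \tanh \frac{\epsilon_c}{2} \right) \right], & \epsilon > \epsilon_c
\end{cases} \tag{C10}
\]

where we have used the fact that $K = \log M / \log 2$ and $\gamma = K/N = \frac{\log M}{N \log 2}$.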